100 research outputs found

    Covering Problems for Partial Words and for Indeterminate Strings

    We consider the problem of computing a shortest solid cover of an indeterminate string. An indeterminate string may contain non-solid symbols, each of which specifies a subset of the alphabet that could be present at the corresponding position. We also consider covering partial words, which are a special case of indeterminate strings where each non-solid symbol is a don't care symbol. We prove that the indeterminate string covering problem and the partial word covering problem are NP-complete for a binary alphabet and show that both problems are fixed-parameter tractable with respect to $k$, the number of non-solid symbols. For the indeterminate string covering problem we obtain a $2^{O(k \log k)} + n k^{O(1)}$-time algorithm. For the partial word covering problem we obtain a $2^{O(\sqrt{k}\log k)} + nk^{O(1)}$-time algorithm. We prove that, unless the Exponential Time Hypothesis is false, no $2^{o(\sqrt{k})} n^{O(1)}$-time solution exists for either problem, which shows that our algorithm for this case is close to optimal. We also present an algorithm for both problems which is feasible in practice. Comment: full version (simplified and corrected); preliminary version appeared at ISAAC 2014; 14 pages, 4 figures

    Average-Case Optimal Approximate Circular String Matching

    Approximate string matching is the problem of finding all factors of a text t of length n that are at a distance at most k from a pattern x of length m. Approximate circular string matching is the problem of finding all factors of t that are at a distance at most k from x or from any of its rotations. In this article, we present a new algorithm for approximate circular string matching under the edit distance model with optimal average-case search time O(n(k + log m)/m). Optimal average-case search time can also be achieved by the algorithms for multiple approximate string matching (Fredriksson and Navarro, 2004) using x and its rotations as the set of multiple patterns. Here we reduce the preprocessing time and space requirements compared to that approach
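
As a rough illustration of the problem statement above (not the average-case optimal algorithm of the article), the sketch below runs the classical Sellers dynamic program once per rotation of the pattern and reports the text positions where an approximate occurrence ends. The function names and the choice to report end positions are assumptions made for this example.

```python
def sellers_end_positions(text, pattern, k):
    """End positions j such that some factor of text ending at j
    is within edit distance k of pattern (Sellers' DP, O(n*m) time)."""
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix
    ends = set()
    for j, tc in enumerate(text):
        cur = [0] * (m + 1)  # cur[0] = 0: an occurrence may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra text character
                         cur[i - 1] + 1)      # missing pattern character
        if cur[m] <= k:
            ends.add(j)
        prev = cur
    return ends


def circular_approx_end_positions(text, pattern, k):
    """Union of the Sellers results over all rotations of pattern."""
    ends = set()
    for r in range(len(pattern)):
        rotation = pattern[r:] + pattern[:r]
        ends |= sellers_end_positions(text, rotation, k)
    return ends


print(circular_approx_end_positions("xxabcxx", "bca", 0))
```

This baseline costs O(n m^2) time in the worst case, which is the kind of overhead the article's algorithm is designed to avoid.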

    Computing Covers under Substring Consistent Equivalence Relations

    Covers are a kind of quasiperiodicity in strings. A string $C$ is a cover of another string $T$ if any position of $T$ is inside some occurrence of $C$ in $T$. The shortest and longest cover arrays of $T$ have the lengths of the shortest and longest covers of each prefix of $T$, respectively. The literature has proposed linear-time algorithms computing longest and shortest cover arrays taking border arrays as input. An equivalence relation $\approx$ over strings is called a substring consistent equivalence relation (SCER) iff $X \approx Y$ implies (1) $|X| = |Y|$ and (2) $X[i:j] \approx Y[i:j]$ for all $1 \le i \le j \le |X|$. In this paper, we generalize the notion of covers for SCERs and prove that existing algorithms to compute the shortest cover array and the longest cover array of a string $T$ under the identity relation will work for any SCERs taking the accordingly generalized border arrays. Comment: 16 pages
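
The definition of a cover can be made concrete with a small sketch under the identity relation (ordinary string equality). It is a simple checker built on the border array, not one of the linear-time cover-array algorithms the abstract refers to; the function names are chosen for this example only. It relies on the fact that every cover of a string is either the string itself or one of its borders.

```python
def border_array(t):
    """b[i] = length of the longest proper border of t[:i+1]."""
    b = [0] * len(t)
    k = 0
    for i in range(1, len(t)):
        while k > 0 and t[i] != t[k]:
            k = b[k - 1]
        if t[i] == t[k]:
            k += 1
        b[i] = k
    return b


def is_cover(c, t):
    """True if every position of t lies inside some occurrence of c."""
    if not c or len(c) > len(t):
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    for start in range(len(t) - len(c) + 1):
        if start > covered_up_to:
            return False  # a position was left uncovered
        if t.startswith(c, start):
            covered_up_to = start + len(c)
    return covered_up_to == len(t)


def shortest_cover(t):
    """Shortest cover of t, testing its borders from shortest to longest."""
    if not t:
        return ""
    b = border_array(t)
    lengths = []
    k = b[-1]
    while k > 0:  # walk the chain of borders of t
        lengths.append(k)
        k = b[k - 1]
    for k in sorted(lengths):
        if is_cover(t[:k], t):
            return t[:k]
    return t  # t always covers itself


print(shortest_cover("abaababaaba"))
```

For "abaababaaba" the borders are "a", "aba" and "abaaba"; "a" leaves position 1 uncovered, while "aba" occurs at positions 0, 3, 5 and 8 and covers the whole string.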

    Graphs Cannot Be Indexed in Polynomial Time for Sub-quadratic Time String Matching, Unless SETH Fails

    The string matching problem on a node-labeled graph G = (V, E) asks whether a given pattern string P has an occurrence in G, in the form of a path whose concatenation of node labels equals P. This is a basic primitive in various problems in bioinformatics, graph databases, or networks, but only recently proven to have an O(|E||P|)-time lower bound, under the Orthogonal Vectors Hypothesis (OVH). We consider here its indexed version, in which we can index the graph in order to support time-efficient string queries. We show that, under OVH, no polynomial-time indexing scheme of the graph can support querying P in time O(|P| + |E|^δ |P|^β), with either δ < 1 or β < 1. As a side-contribution, we introduce the notion of linear independent-components (lic) reduction, allowing for a simple proof of our result. As another illustration that hardness of indexing follows as a corollary of a lic reduction, we also translate the quadratic conditional lower bound of Backurs and Indyk (STOC 2015) for the problem of matching a query string inside a text, under edit distance. We obtain an analogous tight quadratic lower bound for its indexed version, improving the recent result of Cohen-Addad, Feuilloley and Starikovskaya (SODA 2019), but with a slightly different boundary condition. Peer reviewed
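
The non-indexed problem can be sketched with a straightforward online scan that meets the O(|E||P|) bound up to the cost of reading the nodes. The sketch assumes, for illustration only, that every node carries a single-character label, that the graph is given as a label dictionary plus a list of directed edges, and that a matching path may revisit nodes; none of these representation choices come from the paper.

```python
def graph_has_match(labels, edges, pattern):
    """Return True if some path in the graph spells pattern.

    labels: dict mapping node -> single-character label (assumed format)
    edges:  iterable of directed (u, v) pairs (assumed format)
    """
    if not pattern:
        return True
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    # Nodes at which a path spelling pattern[:1] can end.
    active = {v for v, ch in labels.items() if ch == pattern[0]}
    for ch in pattern[1:]:
        # Extend every partial match by one edge; each round scans
        # at most |E| edges, giving O(|V| + |E||P|) time overall.
        active = {v for u in active for v in succ.get(u, ())
                  if labels[v] == ch}
        if not active:
            return False
    return bool(active)


labels = {0: "a", 1: "b", 2: "c", 3: "a"}
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
print(graph_has_match(labels, edges, "abca"))
```

The indexed variant studied in the abstract asks whether preprocessing the graph can beat this per-query cost; the result stated there says that, under OVH, it cannot for any polynomial-time index.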

    Circular pattern matching with k mismatches

    The k-mismatch problem consists in computing the Hamming distance between a pattern P of length m and every length-m substring of a text T of length n, if this distance is no more than k. In many real-world applications, any cyclic shift of P is a relevant pattern, and thus one is interested in computing the minimal distance of every length-m substring of T and any cyclic shift of P. This is the circular pattern m
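
The two definitions above can be illustrated with a brute-force baseline that compares every length-m window of T with every cyclic shift of P. This is only a sketch of the problem statement, with hypothetical function names; it takes O(n m^2) time and is not the algorithm of the paper.

```python
def hamming(a, b):
    """Number of positions at which equal-length strings a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def circular_k_mismatch(text, pattern, k):
    """Map each window start in text to the minimal Hamming distance
    between that window and any cyclic shift of pattern, keeping only
    the windows whose minimal distance is at most k."""
    m = len(pattern)
    if m == 0:
        return {}
    rotations = {pattern[i:] + pattern[:i] for i in range(m)}
    result = {}
    for pos in range(len(text) - m + 1):
        window = text[pos:pos + m]
        best = min(hamming(window, r) for r in rotations)
        if best <= k:
            result[pos] = best
    return result


print(circular_k_mismatch("abxab", "abc", 1))
```

With text "abxab", pattern "abc" and k = 1, each of the three windows is at distance 1 from some rotation of the pattern, so all three start positions are reported.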

    Regulation of the let-7a-3 Promoter by NF-κB

    Changes in microRNA expression have been linked to a wide array of pathological states. However, little is known about the regulation of microRNA expression. The let-7 microRNA is a tumor suppressor that inhibits cellular proliferation and promotes differentiation, and is frequently lost in tumors. We investigated the transcriptional regulation of two let-7 family members, let-7a-3 and let-7b, which form a microRNA cluster and are located 864 bp apart on chromosome 22q13.31. Previous reports present conflicting data on the role of the NF-κB transcription factor in regulating let-7. We cloned three fragments upstream of the let-7a-3/let-7b miRNA genomic region into a plasmid containing a luciferase reporter gene. Ectopic expression of subunits of NF-κB (p50 or p65/RelA) significantly increased luciferase activity in HeLa, 293, 293T and 3T3 cells, indicating that the let-7a-3/let-7b promoter is highly responsive to NF-κB. Mutation of a putative NF-κB binding site at bp −833 reduced basal promoter activity and decreased promoter activity in the presence of p50 or p65 overexpression. Mutation of a second putative binding site, at bp −947 also decreased promoter activity basally and in response to p65 induction, indicating that both sites contribute to NF-κB responsiveness. While the levels of the endogenous primary let-7a and let-7b transcript were induced in response to NF-κB overexpression in 293T cells, the levels of fully processed, mature let-7a and let-7b miRNAs did not increase. Instead, levels of Lin-28B, a protein that blocks let-7 maturation, were induced by NF-κB. Increased Lin-28B levels could contribute to the lack of an increase in mature let-7a and let-7b. Our results suggest that the final biological outcome of NF-κB activation on let-7 expression may vary depending upon the cellular context. We discuss our results in the context of NF-κB activity in repressing self-renewal and promoting differentiation

    Why High-Performance Modelling and Simulation for Big Data Applications Matters

    Modelling and Simulation (M&S) offer adequate abstractions to manage the complexity of analysing big data in scientific and engineering domains. Unfortunately, big data problems are often not easily amenable to efficient and effective use of High Performance Computing (HPC) facilities and technologies. Furthermore, M&S communities typically lack the detailed expertise required to exploit the full potential of HPC solutions while HPC specialists may not be fully aware of specific modelling and simulation requirements and applications. The COST Action IC1406 High-Performance Modelling and Simulation for Big Data Applications has created a strategic framework to foster interaction between M&S experts from various application domains on the one hand and HPC experts on the other hand to develop effective solutions for big data applications. One of the tangible outcomes of the COST Action is a collection of case studies from various computing domains. Each case study brought together both HPC and M&S experts, giving witness of the effective cross-pollination facilitated by the COST Action. In this introductory article we argue why joining forces between M&S and HPC communities is both timely in the big data era and crucial for success in many application domains. Moreover, we provide an overview on the state of the art in the various research areas concerned

    MicroRNA Dysregulation in the Spinal Cord following Traumatic Injury

    Spinal cord injury (SCI) triggers a multitude of pathophysiological events that are tightly regulated by the expression levels of specific genes. Recent studies suggest that changes in gene expression following neural injury can result from the dysregulation of microRNAs, short non-coding RNA molecules that repress the translation of target mRNA. To understand the mechanisms underlying gene alterations following SCI, we analyzed the microRNA expression patterns at different time points following rat spinal cord injury

    The role of adipokines in connective tissue diseases
